# Create a numeric vector
list_no <- c(11, 12, 13, 14, 15, 16, 17)
# Keep values less than or equal to 13 OR greater than or equal to 15
list_no_filtered <- list_no[list_no <= 13 | list_no >= 15]
list_no_filtered
[1] 11 12 13 15 16 17
Dataframes, Lists, External Files, Paths
Press CMD + A on a Mac to select everything
Press Ctrl + A on Windows to select everything
Press “Delete” to delete everything
Now replace it with:
This is how we save the document.
Make sure that you save it in the relevant folder.
This is how we render the qmd file into an HTML.
This is how we add a chunk of code.
This is where we only place R code.
Notice how the R code chunk starts
Notice how the R code chunk ends
This is what “# Introduction” and “This is the introduction.” look like when we render the document
This is how we run a chunk.
This is what our list looks like in the computer’s memory.
This is how we remove one item from the list and save a new list.
Output:
This is how we can do logical subsetting
# Create a character vector
list_words <- c("random", "word", "sentence", "books")
# List of words to exclude
exclusion_list <- c("word", "sentence", "books")
# Keep only elements not in the exclusion list
list_words_filtered <- list_words[!(list_words %in% exclusion_list)]
list_words_filtered
[1] "random"
This is how we can check which string is “greater” alphabetically and how we can count the number of characters in a word
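A minimal sketch of both operations in base R, reusing the words from the character vector above. String comparison follows the sort order of your locale, so treat the comparison as an illustration rather than a guarantee for every pair of strings.

```r
# TRUE if the left string sorts before the right one alphabetically
"books" < "word"

# Count the number of characters in a single word
nchar("sentence")

# nchar() is vectorised, so it also works on a whole character vector
n_chars <- nchar(c("random", "word", "sentence", "books"))
n_chars
```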
This is how we can handle missing data.
This is how we calculate the mean and the median
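The two captions above can be sketched together. The vector `grades` and its values are made up for illustration; the point is that `mean()` and `median()` return `NA` when a value is missing unless `na.rm = TRUE` is set.

```r
# Hypothetical grades with one missing value (NA)
grades <- c(90, 85, NA, 70)

is.na(grades)        # TRUE where a value is missing
sum(is.na(grades))   # number of missing values

mean(grades)         # NA, because one value is missing

# na.rm = TRUE drops missing values before the calculation
grade_mean <- mean(grades, na.rm = TRUE)
grade_median <- median(grades, na.rm = TRUE)
grade_mean
grade_median
```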
df is a dataframe
student and grade are variables within that dataframe
This is how we select specific observations
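One possible way to select observations, sketched with a small made-up dataframe. The variable names `student` and `grade` come from the slide; the values and the threshold of 80 are assumptions for illustration.

```r
# Hypothetical data: four students and their grades
df <- data.frame(
  student = c("A", "B", "C", "D"),
  grade = c(90, 85, 60, 70)
)

# Keep the rows where grade is at least 80 (all columns are kept)
subset_df <- df[df$grade >= 80, ]
subset_df

# Select a single observation by its row number
df[2, ]
```

The logical condition goes before the comma because it selects rows; leaving the part after the comma empty keeps every column.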
This is how we can calculate the average for the two variables from the two dataframes
df has two variables: student and grade
subset_df has two variables: student and grade
This is how we calculate the mean
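A short sketch of the calculation, assuming that the average of interest is the mean of `grade` in each dataframe. The values are made up, and `subset_df` is built here by a hypothetical filter on `grade`.

```r
# Hypothetical data for the two dataframes
df <- data.frame(
  student = c("A", "B", "C", "D"),
  grade = c(90, 85, 60, 70)
)
subset_df <- df[df$grade >= 80, ]

# The $ operator pulls one variable out of a dataframe
mean_all <- mean(df$grade)
mean_subset <- mean(subset_df$grade)
mean_all
mean_subset
```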
This is how we identify Max & Min
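As an illustration with made-up values, base R offers `max()` and `min()` for the values themselves and `which.max()` and `which.min()` for their positions.

```r
# Hypothetical grades
grades <- c(90, 85, 60, 70)

highest <- max(grades)            # largest value
lowest <- min(grades)             # smallest value
pos_highest <- which.max(grades)  # position of the largest value
pos_lowest <- which.min(grades)   # position of the smallest value

highest
lowest
pos_highest
pos_lowest
```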
This is how we work with indexing lists
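A small sketch of indexing, reusing the numeric vector `list_no` from earlier in the slides. The name `list_no_short` is introduced here only for illustration.

```r
list_no <- c(11, 12, 13, 14, 15, 16, 17)

list_no[1]         # first element (indexing in R starts at 1)
list_no[c(2, 4)]   # second and fourth elements
list_no[2:4]       # elements two through four

# A negative index drops that position; save the result under a new name
list_no_short <- list_no[-3]
list_no_short
```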
We can easily remove every object from R’s memory (the environment) with the following command:
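A minimal sketch, assuming the command meant here is the common base R idiom `rm(list = ls())`. The objects `x` and `y` are created only so that there is something to remove.

```r
# Create two throwaway objects
x <- 1
y <- "a"
ls()              # lists the objects currently in the environment

# Remove every object in the current environment
rm(list = ls())
ls()              # nothing is left
```

This only clears objects; packages that were loaded stay loaded.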
Notice the difference before and after
.csv, .xlsx, .txt, or .tsv files
| File Type | Description | R Function |
|---|---|---|
| .csv | Comma-separated values | read.csv() |
| .tsv | Tab-separated values | read.delim() |
| .txt | Generic text file | read.table() |
| .xlsx | Excel spreadsheet | readxl::read_excel() |
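The functions in the table can be tried without any downloaded file. The sketch below writes a tiny made-up dataframe to temporary files and reads it back; `toy` and the temporary paths are assumptions for illustration only.

```r
# Write a tiny dataframe to temporary files so the example is self-contained
toy <- data.frame(student = c("A", "B"), grade = c(90, 85))
csv_path <- tempfile(fileext = ".csv")
tsv_path <- tempfile(fileext = ".tsv")
write.csv(toy, csv_path, row.names = FALSE)
write.table(toy, tsv_path, sep = "\t", row.names = FALSE, quote = FALSE)

# Read the files back with the matching function
from_csv <- read.csv(csv_path)
from_tsv <- read.delim(tsv_path)

# read.table() is the general form: state the header and separator yourself
from_txt <- read.table(tsv_path, header = TRUE, sep = "\t")

# Excel files need the readxl package (placeholder path, not run here):
# readxl::read_excel("path/to/file.xlsx")
from_csv
```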
Download the following datasets from Dropbox:
Now put them in your working directory
Place it in a folder called “data” under “week2”
To open the file, add a new chunk and type:
This is what you should see
The part in red will differ from computer to computer
Notice how the path reflects your folder structure
We can now work with relative paths
Remember this?
This is how we read the csv file.
We can now also load the other dataframe
Notice the difference between relative paths vs. absolute paths
Relative Paths
Absolute Paths
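To make the contrast concrete, here is a small sketch. The file name `example.csv` is a placeholder, not one of the course datasets; the `data` folder follows the folder structure described earlier.

```r
# Relative path: interpreted starting from the current working directory
rel_path <- file.path("data", "example.csv")  # "example.csv" is a placeholder

# Absolute path: spells out the full location, starting from the root
abs_path <- file.path(getwd(), rel_path)

rel_path
abs_path
```

The relative path stays the same on every computer that uses the same folder structure, while the absolute path changes with the user and the machine.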
One common error is the following
If you get that error, your path is not correct
Go back to the previous steps and identify the path to your file.
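One way to diagnose a wrong path before reading the file is to ask R whether the file exists. This is a sketch with a placeholder file name (`example.csv`); replace it with the path to your own file.

```r
# Placeholder path; replace with the path to your own file
path <- file.path("data", "example.csv")

found <- file.exists(path)
if (found) {
  df <- read.csv(path)
} else {
  # Show where R is looking, which usually reveals the mistake
  message("File not found: ", path)
  message("Working directory: ", getwd())
}
```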
Let us now investigate our two datasets
This is how we can examine the first five entries
You should see:
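As a self-contained illustration of the same idea, the sketch below uses a made-up dataframe. Note that `head()` returns six rows by default, so the second argument is needed to get exactly five.

```r
# Hypothetical dataframe with seven rows
df <- data.frame(
  student = c("A", "B", "C", "D", "E", "F", "G"),
  grade = c(90, 85, 60, 70, 88, 92, 75)
)

# First five rows; head(df) without the 5 would return six
first_five <- head(df, 5)
first_five
```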
This is how you install a package in R
Notice that you need to have quotes: "tidyverse"
Once you are done, delete the following line or comment it out:
install.packages("tidyverse")
If you don’t delete it or comment it out, it will cause errors during rendering.
Once you install the package, it will always be on your machine.
As we progress, you might use commands that result in:
If you see this error, you need to install the package:
Once the package is installed, comment it out or delete it
To use the commands associated with the package, you need to load it
You will need to load this package to use its functions
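Loading is normally done with `library(tidyverse)` at the top of the document. The sketch below wraps that call in a check so that it does not fail on a machine where the package is missing; the variable `has_tidyverse` is introduced here only for illustration.

```r
# TRUE if the package is installed, FALSE otherwise (no error either way)
has_tidyverse <- requireNamespace("tidyverse", quietly = TRUE)

if (has_tidyverse) {
  # Load the package so its functions can be used in this session
  library(tidyverse)
} else {
  message('tidyverse is not installed; run install.packages("tidyverse") once in the console')
}
```

Installing happens once per machine, while loading has to be repeated in every new R session and in every document that uses the package.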
We will examine our data using glimpse
Rows: 20,445
Columns: 4
$ Entity <chr> "Afghanistan", "Afghanistan", "A…
$ Code <chr> "AFG", "AFG", "AFG", "AFG", "AFG…
$ Year <int> 1950, 1951, 1952, 1953, 1954, 19…
$ Life.expectancy.at.birth..historical. <dbl> 27.7, 28.0, 28.4, 28.9, 29.2, 29…
or
Rows: 20,445
Columns: 4
$ Entity <chr> "Afghanistan", "Afghanistan", "A…
$ Code <chr> "AFG", "AFG", "AFG", "AFG", "AFG…
$ Year <int> 1950, 1951, 1952, 1953, 1954, 19…
$ Life.expectancy.at.birth..historical. <dbl> 27.7, 28.0, 28.4, 28.9, 29.2, 29…
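The commands that produced this output are not shown here, so the following is only a sketch of how `glimpse()` can be called. It rebuilds the first two rows from the output above as `small_df` (a name made up for this example) and falls back to base R's `str()` if dplyr, which the tidyverse loads, is not installed.

```r
# First two rows copied from the output above
small_df <- data.frame(
  Entity = c("Afghanistan", "Afghanistan"),
  Code = c("AFG", "AFG"),
  Year = c(1950L, 1951L),
  Life.expectancy.at.birth..historical. = c(27.7, 28.0)
)

if (requireNamespace("dplyr", quietly = TRUE)) {
  # One row per variable: name, type, and the first few values
  dplyr::glimpse(small_df)
} else {
  # Base R alternative that gives a similar overview
  str(small_df)
}
```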
We have four variables within our dataframe:
Entity: string or character variable
Code: string or character variable
Year: numeric or integer variable
Life.expectancy.at.birth..historical.: numeric or double precision variable
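The type abbreviations that glimpse prints (`<chr>`, `<int>`, `<dbl>`) correspond to R's basic types. A quick sketch with single values taken from the output above:

```r
type_chr <- typeof("Afghanistan")  # text is stored as "character"
type_int <- typeof(1950L)          # the L suffix makes a whole number an "integer"
type_dbl <- typeof(27.7)           # numbers with decimals are "double"

type_chr
type_int
type_dbl

# Both integers and doubles count as numeric
is.numeric(1950L)
is.numeric(27.7)
```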
We have four variables within our dataframe:
Entity: the country: “Afghanistan”, “Albania”, “Algeria”, etc.
Code: the country code: “AFG”, “ALB”, “DZA”, etc.
Year: year 1950, 1951, 1952, etc.
Life.expectancy.at.birth..historical.: life expectancy corresponding to that year
Popescu (JCU): Dataframes, Lists, External Files, Paths